Name | Version | Summary | Date |
atai-pdf-tool | 0.1.0 | A tool for parsing and extracting text from PDF files with OCR capabilities | 2025-02-27 11:15:46 |
kreuzberg | 2.1.0 | A text extraction library supporting PDFs, images, office documents and more | 2025-02-20 15:26:43 |
fileseek | 0.1.3 | FileSeek – AI-Powered Local Document Archive&Search | 2025-02-08 07:13:54 |
tikara | 0.1.5 | The metadata and text content extractor for almost every file type. | 2025-01-26 23:33:40 |
pdf-parser-header-footer | 0.1.0 | A Python package for processing PDFs with header and footer detection | 2025-01-14 16:10:34 |
spanish-pdf-parser | 0.1.0 | A Python package for processing PDFs with header and footer detection | 2025-01-13 14:56:27 |
vlense | 0.1.4 | A Python package to extract text from images and PDFs using Vision Language Model (VLM). | 2024-11-06 10:51:15 |
trafilatura | 1.12.2 | Python package and command-line tool designed to gather text on the Web, includes all necessary discovery and text processing components to perform web crawling, downloads, scraping, and extraction of main texts, metadata and comments. | 2024-09-10 12:42:48 |